Grok-1 Overview

Grok-1 is a large language model (LLM) with 314 billion parameters. It uses a mixture-of-experts (MoE) approach, meaning that only 25% of its weights are activated for each token during processing. This model was developed by xAI and has an open release of its base model weights and architecture.

The model was trained with data up until October 2023, but it’s important to note that Grok-1 is still in its raw form and hasn't been fine-tuned for specific tasks like chatbots. It is available under the Apache 2.0 license.

Performance Highlights

Grok-1 has shown strong performance in reasoning and coding tasks. For instance, it scored 63.2% on the HumanEval coding test and 73% on the MMLU benchmark. While it outperforms models like ChatGPT-3.5 and Inflection-1, it does not quite match the capabilities of improved models such as GPT-4.

In a specific test, Grok-1 received a C grade (59%) on the Hungarian national high school mathematics finals, while GPT-4 achieved a B (68%).

You can find the model here [Grok-1 GitHub](httpsgithub.comxai-orggrok-1).

Because of its large size (314 billion parameters), xAI suggests using a multi-GPU setup for testing Grok-1.